OntoDNA: Ontology Alignment Results for OAEI 2007
Abstract
OntoDNA is an automated ontology mapping and merging system that uses unsupervised data mining methods, comprising Formal Concept Analysis (FCA), Self-Organizing Map (SOM) and K-Means, combined with a lexical similarity measure, namely Levenshtein edit distance. The unsupervised data mining methods are used to resolve structural and semantic heterogeneities between ontologies, while lexical similarity is used to resolve lexical heterogeneity. OntoDNA generates a merged ontology in concept lattice form, which enables visualization of the concept space based on the formal context. This paper briefly describes the OntoDNA system and discusses the alignment results obtained on some of the OAEI 2007 datasets. The paper also presents the strengths and weaknesses of our system and ways to improve the current approach.
1 Presentation of the system
1.1 State, purpose, general statement
OntoDNA is an automated ontology mapping and merging tool that provides a scalable environment for interoperating ontologies between information sources. OntoDNA aims to offer contextual and robust ontology mapping and merging through hybrid unsupervised clustering techniques, which comprise Formal Concept Analysis (FCA) [1], Self-Organizing Map (SOM) and K-Means clustering [2], combined with a lexical measure, Levenshtein edit distance [3]. OntoDNA generates a merged ontology in concept lattice form, which enables visualization of the concept space based on the formal context.
1.2 Specific techniques used
An ontology is formalized as a tuple O := (C, SC, P, SP, A, I), where C is the set of concepts of the ontology and SC is the hierarchy of concepts. The relationships between concepts are defined by the properties of the ontology, P, and SP is the hierarchy of properties. A refers to the axioms used to infer knowledge from existing knowledge, and I to the instances of concepts [4].
OntoDNA resolves heterogeneous ontologies by capturing ontological concepts (C) and their ontological elements (SC, P, SP, A) [5]. OntoDNA uses FCA to capture the properties and the inherent structural relationships among the ontological concepts of heterogeneous ontologies. The captured structures of ontological concepts act as background knowledge for resolving semantic interpretations in similar contexts (synonymy) or different contexts (polysemy).
The unsupervised clustering techniques, Self-Organizing Map (SOM) and K-Means, are used to overcome the absence of prior knowledge when discovering structural and semantic heterogeneities between ontologies. SOM organizes ontological elements so that more similar ontological concepts are clustered together. The clusters of ontological concepts are derived from the natural characteristics of the ontological elements. K-Means is then used to reduce the problem size of the SOM map so that semantic heterogeneity in different contexts can be discovered efficiently.
OntoDNA relies on lexical similarity to resolve lexical heterogeneity in both ontological concept names and property names. Lexical similarity is measured with Levenshtein edit distance at a threshold value of 0.8 [5]. Before the degree of lexical similarity is computed, linguistic processing steps (case normalization, blank normalization, digit normalization, namespace prefix elimination, link stripping, and stopword filtering) are applied to normalize the ontological elements. The OntoDNA automated ontology mapping and merging framework is depicted in Figure 1.
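The linguistic processing and lexical matching described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not OntoDNA's actual implementation. The function names, the stopword list, the reading of "digit normalization" as dropping digits, and the conversion of edit distance into a similarity score in [0, 1] (one minus the distance divided by the longer string length) are all assumptions made for illustration. Only the normalization steps and the threshold of 0.8 come from the text.

```python
import re

# Hypothetical stopword list; the source does not say which words are filtered.
STOPWORDS = {"a", "an", "the", "of", "and", "or"}


def normalize(name: str) -> str:
    """Normalize one ontological element name (assumed reading of each step)."""
    name = name.rsplit("#", 1)[-1].rsplit("/", 1)[-1]  # link stripping
    name = name.split(":", 1)[-1]                      # namespace prefix elimination
    name = name.lower()                                # case normalization
    name = re.sub(r"\d+", "", name)                    # digit normalization (assumed: drop digits)
    name = re.sub(r"[\s_\-]+", " ", name).strip()      # blank normalization
    tokens = [t for t in name.split() if t not in STOPWORDS]  # stopword filtering
    return " ".join(tokens)


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def lexical_similarity(a: str, b: str) -> float:
    """Assumed normalization of edit distance into a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_lexical_match(source_name: str, target_name: str, t: float = 0.8) -> bool:
    """True when the normalized names reach the similarity threshold t."""
    return lexical_similarity(normalize(source_name), normalize(target_name)) >= t
```

Under these assumptions, `is_lexical_match("Author", "authors")` holds because the normalized names differ by a single edit over seven characters, while `is_lexical_match("Book", "Paper")` does not.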
The terms used in the OntoDNA framework are defined as follows:
− Source ontology OS: the local data repository ontology.
− Target ontology OT: the non-local data repository ontology.
− Formal contexts KS and KT: KS is the formal context representation of the conceptual relationships of the source ontology OS, and KT is the formal context representation of the conceptual relationships of the target ontology OT.
− Reconciled formal contexts RKS and RKT: formal contexts with normalized intents of the source and target ontological concepts' properties.
− Ontological elements O := (C, SC, P, SP, A): C is the set of concepts of the ontology and SC is the hierarchy of concepts. P is the set of properties of the ontology and SP is the hierarchy of properties. A refers to the axioms.
Figure 1. OntoDNA's framework
The algorithmic implementation of the automated mapping and merging process illustrated in Figure 1 is as follows [5] [6]:
Input: Two ontologies that are to be merged, OS (source ontology) and OT (target ontology).
Step 1: Ontological contextualization
The conceptual patterns of OS and OT are discovered using FCA. Given an ontology O := (C, SC, P, SP, A), OS and OT are contextualized using FCA into the formal contexts KS and KT. The ontological concepts C are denoted as G (objects) and the remaining ontology elements, SC, P, SP and A, are denoted as M (attributes). The binary relation I ⊆ G × M of the formal context records which ontology elements (SC, P, SP and A) correspond to each ontological concept in C.
Step 2: Pre-linguistic processing
String normalizations are applied to transform the attributes in KS and KT prior to lexical similarity mapping.
The mapping rules (Map_Rule 1 and Map_Rule 2) (Table 1) are applied to reconcile the intents in KS and KT. The reconciled formal contexts RKS and RKT are the input for semantic similarity discovery in the next step.
Step 3: Contextual clustering
SOM and K-Means are applied for semantic similarity mapping based on the conceptual patterns discovered in the formal contexts. First, SOM is trained on the formal context RKT. K-Means clustering is then applied to reduce the problem size of the SOM clusters, with the clustering validated by the Davies-Bouldin index. Subsequently, the formal concepts of RKS are fed to the trained SOM. The source ontological concepts are assigned to the same clusters as their Best Matching Units (BMUs) in the target ontology.
Step 4: Post-linguistic processing
The mapping rules (Map_Rule 1 and Map_Rule 2) (Table 1) are applied to discover semantic similarity between the ontological concepts in the clusters. The ontological concepts of the target ontology are merged into the source ontology according to the merging rules (Merge_Rule 1 and Merge_Rule 2) (Table 1).
Output: A merged ontology in the form of a concept lattice.
Mapping Rules
Given a source ontological element OelementSi and a target ontological element OelementTj, apply the lexical similarity measure (LSM) to map the target ontology OT to the source ontology OS at threshold value t, where i and j = 1, 2, 3, ..., n.
Map_Rule 1: map (OelementTj → OelementSi), if LSM(OelementSi, OelementTj) ≥ t; the target ontological element OelementTj is mapped to (integrated with) the source ontological element OelementSi, and the naming convention and structure of the source ontological element OelementSi are preserved.
Map_Rule 2: merge (OelementTj → OS), if LSM(OelementSi, OelementTj) < t; the target ontological element OelementTj is merged into (appended to) the source ontology, and the naming convention and structure of the target ontological element OelementTj are preserved.
Merging Rules
Let the source ontology OS be given in a reconciled formal context k = (G, M, I) and the target ontology OT in a reconciled formal context l = (H, N, J). The source ontology is the base for ontology merging.
Merge_Rule 1: If Map_Rule 1 or Map_Rule 3 is true, the intents of OelementTj (ontological concepts) and its object-attribute relationship J ⊆ H × N are aligned with (appended to) formal context k.
Merge_Rule 2: If Map_Rule 2 is true, and the concept order in formal context k is defined by (OextentS1, OintentS1) ≤ (OextentS2, OintentS2) :⇔ OextentS1 ⊆ OextentS2 (⇔ OintentS2 ⊆ OintentS1), then the intents of OelementTj, its object-attribute relationship J ⊆ H × N, and the subconcept-superconcept relations of OelementTj to other concepts are aligned into formal context k, and the structural relationships of the appended concept are updated with the target ontology as the base.
Table 1. Ontology mapping and merging rules
1.3 Adaptations made for the evaluation
No special adaptation was made for the tests in the Ontology Alignment Evaluation Initiative (OAEI) 2007 campaign. However, a small program was written to translate our native alignment format into the format required by the OAEI contest. The URI for benchmark ontology 302 was manually replaced in order to output the alignment file.
1.4 Link to the system, parameters file and to the set of provided alignments
The OntoDNA system and the alignment results, in a ZIP file organized as presented, can be downloaded from http://pesona.mmu.edu.my/~cckiu/OAEI2007.htm.